Abstract

The growth dynamics of complex organizations have attracted much interest from econophysicists and sociophysicists
in recent years. However, most studies have focused on developed countries. We investigate the growth dynamics
of the primary industry and the population of 2079 counties in mainland China using the data from the China County 
Statistical Yearbooks from 2000 to 2006. We find that the annual growth rates are distributed according to Student's t 
distribution with the tail exponent less than 2. We find power-law relationships between the sample standard deviation 
of the growth rates and the initial size. The scaling exponent is less than 0.5 for the primary industry and close to 0.5 
for the population. 
1. Introduction

The law of proportionate growth, also known as the law of proportionate effect or Gibrat's law, is a
fundamental regularity in industrial economics. Economists and econophysicists have long studied the growth
dynamics of firms. Denoting by $S(t)$ the size of a firm at time $t$, its logarithmic growth rate is defined as

$r(t) = \ln S(t) - \ln S(t-1).$ (1)

The basic assumptions of Gibrat's law state that (1) the growth rate $r(t+1)$ of a firm is independent of its size $S(t)$,
(2) the successive growth rates $r(t)$ are temporally uncorrelated, and (3) the growth rate distribution $f(r)$ is Gaussian.
It follows immediately that the firm size distribution is log-normal.
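
The step from the three assumptions to log-normality can be written out explicitly. The sketch below is the standard argument, assuming the growth rates are independent Gaussian variables with mean $\mu$ and variance $s^2$; these two symbols are introduced here for illustration and do not appear in the original text.

```latex
\ln S(t) = \ln S(0) + \sum_{\tau=1}^{t} r(\tau)
\quad\Longrightarrow\quad
\ln S(t) \sim \mathcal{N}\!\left(\ln S(0) + \mu t,\; s^{2} t\right)
```

Since $\ln S(t)$ is a sum of independent Gaussian terms, it is Gaussian, so $S(t)$ is log-normal.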

However, there is clear empirical evidence that Gibrat's law does not hold. Using data on all
publicly traded US manufacturing firms for the years 1974-1993, it was found that the distribution $f(r|S)$ of the
annual growth rates of firms with approximately the same size $S$ has an exponential form

$f(r|S) \triangleq f(r(t)|S(t-1)) = \frac{1}{\sqrt{2}\,\sigma(S)} \exp\left(-\frac{\sqrt{2}\,|r - \bar{r}(S)|}{\sigma(S)}\right)$ (2)

and that its standard deviation $\sigma(S)$ scales as a power law with the firm size $S$,

$\sigma(S) \sim S^{-\beta},$ (3)

where the scaling exponent $\beta$ is about 0.20 and is related to the fluctuation exponent $H$ by
$\beta = 1 - H$ [4]. Further investigation finds that the growth rates are distributed exponentially in the bulk, followed by
power-law tails,

$f(r|S) \sim |r|^{-\zeta},$ (4)

where $\zeta \approx 3$.

Very similar statistical regularities of the growth rates have been unveiled in diverse fields, including other business
firms, gross domestic product (GDP) [11-13], scientific output [14, 15], human and bird populations [15-17],
research and development expenditure [14], and so on. The growth dynamics of these complex systems share a
striking similarity, indicating that the underlying interactions of their subunits might exhibit universal behaviors. Distinct
numerical and analytic models have been proposed to understand the growth dynamics of complex organizations.

In this work, we investigate the annual growth rate distributions of the primary industry and the population of 2079
counties in mainland China over seven years, using data retrieved from the China County Statistical Yearbooks from
2000 to 2006. The data sets are briefly described in Section 2. Section 3 studies the growth rate distributions of the
two variables and Section 4 studies the size dependence of the standard deviations of the growth rates. Section 5
summarizes.



2. Data sets 

The reform and opening-up policy has greatly affected China's economy and people's living standards since 1978.
China has made great achievements in economic development over the past 30 years, with an average annual GDP
growth rate of about 9.8%, much higher than the world average of around 3.3% over the same period. GDP per capita
now exceeds 2.4 thousand U.S. dollars, having grown by 8.2% per year since the beginning of the reform and
opening-up. It is thus interesting to investigate the growth dynamics of socio-economic variables of China.

The population and primary industry data analyzed in this work were collected from the China County Statistical
Yearbooks from 2000 to 2006. The yearbooks, edited by the National Bureau of Statistics for the rural socio-economic
survey and published by China Statistical Publishing House, record statistical information on the socio-economic
development of the counties in mainland China, including economy, agriculture, industry, investment, education,
health, and so on. The yearbooks cover all counties except those in Hong Kong SAR, Macao SAR, and Taiwan
Province, giving 2079 counties in total. In the yearbook, the total annual population represents the number of people
in a certain period and in a certain area. The end annual population refers to the number of people just before
December 31 each year in a certain area. Primary industry is the sum of agriculture, forestry, animal husbandry and
fishery.

The basic statistics of the data sets are presented in Table 1. The kurtosis is much larger than 3 for both the primary
industry and the population in every year from 2001 to 2006, which implies that the distributions have fat tails. The
skewness is negative in all but one case (the population in 2004), which means that the distributions are asymmetric
and mostly skewed to the left. We standardize the growth rates so that $r$ has zero mean and unit variance.
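
The preprocessing described above can be sketched in a few lines of Python. This is an illustration only: the county sizes are made-up numbers, the growth rates are pooled before standardizing, and the population (not sample) standard deviation is used, none of which is specified in the text.

```python
import math
import statistics

def log_growth_rates(sizes):
    """Return r(t) = ln S(t) - ln S(t-1) for one county's yearly sizes."""
    return [math.log(b) - math.log(a) for a, b in zip(sizes, sizes[1:])]

def standardize(values):
    """Shift and rescale to zero mean and unit (population) variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def skewness(values):
    """Third standardized moment; negative means a heavier left tail."""
    return statistics.fmean([x ** 3 for x in standardize(values)])

def kurtosis(values):
    """Fourth standardized moment (not excess); a Gaussian gives 3."""
    return statistics.fmean([x ** 4 for x in standardize(values)])

# Hypothetical yearly sizes for three counties (made-up numbers).
counties = {
    "A": [100.0, 110.0, 121.0, 90.0],
    "B": [50.0, 55.0, 52.0, 60.0],
    "C": [200.0, 210.0, 150.0, 220.0],
}
pooled = [r for sizes in counties.values() for r in log_growth_rates(sizes)]
z = standardize(pooled)  # standardized growth rates used in the fits
```

The kurtosis here is the raw fourth standardized moment, so it is compared with 3 rather than 0, matching the convention used in Table 1.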



Table 1: Basic statistics of the growth rates of the primary industry and the population of 2079 counties in mainland China.

        Primary Industry                              Population
Year    Max    Min    Mean   Std    Skew     Kurt     Max    Min    Mean   Std    Skew     Kurt
2001    0.94   -3.52  0.01   0.22   -6.29    77.19    0.94   -1.41  0.00   0.08   -4.44    153.76
2002    1.07   -2.26  0.06   0.17   -1.60    48.98    0.34   -0.51  0.00   0.04   -1.13    38.62
2003    1.06   -2.27  0.05   0.17   -2.53    33.10    0.71   -2.40  0.00   0.09   -12.76   333.23
2004    1.08   -5.91  0.14   0.38   -11.21   156.39   0.92   -0.76  0.01   0.06   2.09     114.95
2005    0.91   -2.29  0.08   0.18   -5.99    77.06    0.65   -2.51  -0.00  0.10   -17.45   410.42
2006    0.90   -6.54  0.07   0.19   -25.23   871.57   0.69   -2.40  0.00   0.09   -15.24   348.97

3. Probability distributions of the standardized growth rates



3.1. Unconditional distributions 

Figure 1 illustrates the empirical probability distribution functions $f(r)$ of the standardized annual growth rates of
the primary industry and the population of 2079 counties in mainland China. We find that Student's t distribution fits
the data well, which reads

$f(r) = \frac{\sqrt{L}\, v^{v/2}}{B(1/2, v/2)} \left[v + L(r - m)^2\right]^{-(v+1)/2},$ (5)

where $v$ is the degrees-of-freedom parameter, $m$ is the location parameter, $L$ is the scale parameter, and
$B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$ is the Beta function, with $\Gamma(\cdot)$ the Gamma function. Since the
growth rates are standardized, we have $m = 0$. Using the least-squares fitting method, we obtain $v = 1.51$ and
$L = 16.02$ for the primary industry growth rate and $v = 1.25$ and $L = 86.01$ for the population growth rate. We note
that Student's t distribution is in essence the q-Gaussian distribution [31], which finds its applications in financial
returns [32-38].




Figure 1: Empirical probability density functions $f(r)$ of the standardized annual growth rates $r$ for the primary industry (a) and the population (b).
The solid lines are Student's t distributions fitted to the data with parameters $v = 1.51$ and $L = 16.02$ for the primary industry and $v = 1.25$ and
$L = 86.01$ for the population, respectively. The dot-dashed lines are the fits to Eq. (7).



In order to test whether the annual growth rates $r$ of the primary industry and the population are drawn from
Student's t distributions, we use the Kolmogorov-Smirnov (KS) test, where the KS statistic is defined as

$KS = \max |F - P|,$ (6)

where $F$ is the cumulative distribution of the best fit and $P$ is the cumulative distribution of the empirical or synthetic
data. We generate 1000 sequences from the fitted Student distribution and calculate the KS statistic for each sequence.
The p-value is then the fraction of synthetic sequences whose KS statistic exceeds that of the original sequence.
We find p-values of 0.94 for the primary industry data and 0.80 for the population data, so the hypothesis that the
annual growth rates follow the t distribution cannot be rejected for either variable.
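
The bootstrap procedure just described can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes NumPy and SciPy are available, maps the scale parameter $L$ of the t density to SciPy's `scale = 1/sqrt(L)`, and compares each synthetic sequence with the same fitted CDF without refitting. The function name `ks_bootstrap_pvalue` and the demo data are hypothetical.

```python
import numpy as np
from scipy import stats

def ks_bootstrap_pvalue(data, v, L, m=0.0, n_boot=1000, seed=0):
    """Bootstrap p-value for H0: data ~ Student's t with fixed (v, m, L)."""
    rng = np.random.default_rng(seed)
    # The density depends on r through sqrt(L) * (r - m), so scale = 1/sqrt(L).
    fitted = stats.t(df=v, loc=m, scale=1.0 / np.sqrt(L))
    ks_data = stats.kstest(data, fitted.cdf).statistic
    ks_synth = np.empty(n_boot)
    for i in range(n_boot):
        sample = fitted.rvs(size=len(data), random_state=rng)
        ks_synth[i] = stats.kstest(sample, fitted.cdf).statistic
    # Fraction of synthetic sequences with a larger KS statistic than the data.
    return float(np.mean(ks_synth > ks_data))

# Demo on synthetic data drawn from the reported primary-industry fit.
demo_rng = np.random.default_rng(1)
demo = stats.t(df=1.51, scale=1.0 / np.sqrt(16.02)).rvs(size=500, random_state=demo_rng)
p = ks_bootstrap_pvalue(demo, v=1.51, L=16.02, n_boot=200)
```

A large p-value means the data are no further from the fitted curve than samples genuinely drawn from it, which is why the hypothesis is not rejected.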

For comparison, we also adopt the theoretical form derived for firm growth,

$f(r) = \frac{2V}{\sqrt{r^2 + 2V}\,\left(|r| + \sqrt{r^2 + 2V}\right)^2}.$ (7)

The resulting fits are also illustrated in Fig. 1. The KS tests give p-values of 0.90 and 0.00 for the primary
industry and the population, respectively. Therefore, the distribution (7) can be used to model the growth rates of the
primary industry but not those of the population. This finding is reasonable, since the growth of the primary industry
is expected to have a mechanism more similar to that of firms than the growth of the population does.

We also use an alternative goodness-of-fit measure based on the Cramér-von Mises (CvM) statistic, which is defined as

$W^2 = \int_{-\infty}^{\infty} \left[F_N(x) - F(x)\right]^2 \, dF(x),$ (8)

where $F_N(x)$ is the empirical distribution function of the sample and $F(x)$ is a specified continuous distribution. We
find qualitatively the same results as with the KS test. Therefore, the Student distribution is the better model
for the growth rates of the primary industry and the population. We note that it is impossible to provide a proof for a
specific distribution [42]. Additionally, a reasonable quantification of the distribution tail behavior is difficult for such
a limited data set.
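
For a finite sample the integral defining $W^2$ reduces to a sum over the sorted observations. The sketch below uses the standard computing formula. The helper `cauchy_type_cdf` is a hypothetical stand-in for the fitted CDF: it is the t distribution with $v = 1$ and $m = 0$, chosen only because its CDF has a closed form, not because it matches the fitted parameters.

```python
import math

def cvm_statistic(sample, cdf):
    """W^2 = integral of [F_N(x) - F(x)]^2 dF(x) for a continuous CDF F.

    Uses the computing formula
    N * W^2 = 1/(12 N) + sum_i [F(x_(i)) - (2i - 1)/(2N)]^2,
    where x_(1) <= ... <= x_(N) are the sorted observations.
    """
    xs = sorted(sample)
    n = len(xs)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(xs, start=1):
        total += (cdf(x) - (2 * i - 1) / (2.0 * n)) ** 2
    return total / n

def cauchy_type_cdf(x, L=1.0):
    """CDF of the t density with v = 1 and m = 0 (closed form)."""
    return 0.5 + math.atan(math.sqrt(L) * x) / math.pi

w2 = cvm_statistic([-0.3, 0.1, 0.8, -2.5, 4.0], cauchy_type_cdf)
```

Many references quote $N W^2$ rather than $W^2$; the two differ only by the factor $N$, so the ranking of candidate models is unaffected.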



3.2. Conditional distributions 

We further investigate whether the distributions of the growth rates depend on the initial values of the
primary industry and the population. The growth rates of the primary industry are first sorted in ascending order of
the initial size and then grouped into two bins with an identical number of data points, that is,
$1.6 \times 10^6 < S \le 4 \times 10^8$ (small bin) and $4 \times 10^8 < S \le 3.6 \times 10^9$ (large bin), where $S$ is the
value of the primary industry. For each bin we calculate the conditional probability density function $f(r|S)$. The
results are shown in Figure 2(a). We find that the data in each bin follow the t distribution with different parameters.
Using the least-squares fitting method, we have $v = 1.26$ and $L = 19.33$ for the small bin of the primary industry and
$v = 1.85$ and $L = 11.16$ for the large bin. Similarly, we divide the population growth rates into two bins,
$0.95 \times 10^4 < S \le 5 \times 10^5$ (small bin) and $5 \times 10^5 < S \le 2.5 \times 10^7$ (large bin).
The data in the two bins, presented in Figure 2(b), again obey the t distribution, with the parameters
$v = 1.27$ and $L = 37.78$ for the small bin and $v = 1.73$ and $L = 168.55$ for the large bin.

Figure 2: (Color online) (a) Plot of the empirical conditional probability density function $f(r|S)$ for two bins of the primary industry. The solid lines are
the t distributions fitted to the data in each bin with the parameters $v = 1.26$ and $L = 19.33$ for the small bin of the primary industry and $v = 1.85$ and
$L = 11.16$ for the large one. (b) Plot of the empirical conditional probability density function $f(r|S)$ for two bins of the population. The solid lines are the
t distributions fitted to the data in each bin with the parameters $v = 1.27$ and $L = 37.78$ for the small bin of the population and $v = 1.73$ and
$L = 168.55$ for the large one.



3.3. Power-law tails 

In the preceding two subsections, we have seen that the growth rates have power-law tails for both the primary
industry and the population. Here we further investigate the tails of the complementary cumulative distributions
$C(r) = \Pr(\ge r)$ of the whole data sets in Section 3.1 and the subgroups in Section 3.2. The empirical complementary
cumulative distributions are illustrated in Fig. 3. Visual inspection suggests that there are power-law tails in these
empirical distributions,

$C(r) \sim r^{-\alpha},$ (9)

where the tail exponent $\alpha$ can be determined in a statistical way.

Figure 3: Empirical complementary cumulative distribution functions $C(r)$ for the primary industry and the population: (a) all sizes, (b) small sizes,
and (c) large sizes.



In order to determine the tail exponents, we adopt a nonparametric maximum likelihood estimation method proposed
by Clauset, Shalizi and Newman (CSN) [42]. The CSN method fits the power-law distribution to the data, along
with a goodness-of-fit based approach to estimating the lower cutoff $x_{\min}$ of the scaling region. Having $x_{\min}$
enables us to determine the percentage of the data points in the tails that are included in the fitting. The p-values are
also calculated based on the Kolmogorov-Smirnov test for the power-law model (9). The results are presented in
Table 2. We find that all the tail exponents lie in the interval (1, 2), which means that the distributions are within the
Lévy regime [43]. For comparison, we also list in Table 2 the values of $v$ in the t distribution. These values
are also less than 2, and the two sets of exponents estimated from the two methods are consistent with each other. This
finding implies that the growth dynamics of the primary industry and the population studied in this work fall into a
distinct regime that is different from the growth dynamics of firms, where the tail exponent is $\zeta - 1 \approx 2$ [5, 6].
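
The core of the CSN procedure can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the reference implementation: it treats the data as continuous and positive, estimates the exponent $\alpha$ of the complementary cumulative distribution (CSN's density exponent is $\alpha + 1$), scans every observed value as a candidate $x_{\min}$, and omits the bootstrap p-value step. The function names and the Pareto demo data are hypothetical.

```python
import math
import random

def tail_exponent_mle(data, xmin):
    """MLE of alpha in C(r) ~ r^(-alpha) for the observations >= xmin."""
    tail = [x for x in data if x >= xmin]
    return len(tail) / sum(math.log(x / xmin) for x in tail)

def ks_distance(data, xmin, alpha):
    """Largest gap between the empirical and fitted tail CDFs above xmin."""
    tail = sorted(x for x in data if x >= xmin)
    n = len(tail)
    gap = 0.0
    for i, x in enumerate(tail):
        model = 1.0 - (x / xmin) ** (-alpha)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        gap = max(gap, abs(model - i / n), abs(model - (i + 1) / n))
    return gap

def fit_tail(data, min_tail=50):
    """Pick the xmin with the smallest KS gap; needs more than min_tail points."""
    xs = sorted(x for x in data if x > 0)
    best = None
    for xmin in sorted(set(xs[:-min_tail])):
        alpha = tail_exponent_mle(xs, xmin)
        d = ks_distance(xs, xmin, alpha)
        if best is None or d < best[0]:
            best = (d, xmin, alpha)
    _, xmin, alpha = best
    frac = sum(x >= xmin for x in data) / len(data)  # share of data in the tail
    return xmin, alpha, frac

# Demo: Pareto samples with a known tail exponent of 1.5 and cutoff 1.
random.seed(0)
demo = [random.paretovariate(1.5) for _ in range(20000)]
alpha_hat = tail_exponent_mle(demo, xmin=1.0)
```

The scan over $x_{\min}$ is quadratic in the sample size, which is acceptable for a few thousand points but would need a smarter search for much larger data sets.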



Table 2: Estimation of the tail exponents using the nonparametric approach of Clauset, Shalizi and Newman [42], compared with the fitted
values using the t distribution (5). The second row of the CSN method gives the percentage of the data points included in the fitting.

                             Primary Industry             Population
Method          Parameter    All      Small    Large      All      Small    Large
CSN             x_min        1.10     0.38     0.98       0.26     0.26     0.39
                pctg%        8.08     30.24    9.10       20.22    31.29    19.94
                α            1.58     1.20     1.87       1.16     1.15     1.45
t distribution  v            1.51     1.26     1.85       1.25     1.27     1.73
                L            16.02    19.33    11.16      86.01    37.78    168.55



4. Size-dependent standard deviation 

In this section, we analyze the relationship between the standard deviation $\sigma$ of the growth rate and the initial size
$S$ used to calculate the growth rate. Unfortunately, as shown in Section 3, the degree of freedom $v$ in Student's t
distribution is less than 2. The theoretical consequence is that the variance of the growth rates diverges. Nevertheless,
we calculate the sample variance $\sigma^2$ regardless of its asymptotic divergence. The data points are rearranged so that
the sizes $S$ are in ascending order. We then partition the data points into $N$ groups with an identical number of data points.
For each group, we calculate the average size $S$ and the empirical standard deviation $\sigma$. Fig. 4(a) plots $\sigma$ against $S$ for
$N = 9$ for both the primary industry and the population. Power-law relations are observed in both cases, as expressed
in Eq. (3). We find that $\beta = 0.21 \pm 0.06$ for the primary industry and $\beta = 0.47 \pm 0.03$ for the population.
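
The binning procedure above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the exponent is taken from an ordinary least-squares line in log-log coordinates, leftover points that do not fill a complete bin are dropped, and the demo data are synthetic with a known exponent of 0.5. The function name `scaling_exponent` is hypothetical.

```python
import math
import random
import statistics

def scaling_exponent(sizes, rates, n_bins=9):
    """Estimate beta in sigma(S) ~ S^(-beta) from equal-count size bins."""
    pairs = sorted(zip(sizes, rates))          # ascending initial size
    per_bin = len(pairs) // n_bins             # leftover largest sizes are dropped
    log_s, log_sigma = [], []
    for k in range(n_bins):
        chunk = pairs[k * per_bin:(k + 1) * per_bin]
        mean_size = statistics.fmean([s for s, _ in chunk])
        sigma = statistics.stdev([r for _, r in chunk])
        log_s.append(math.log(mean_size))
        log_sigma.append(math.log(sigma))
    # Ordinary least-squares slope of log(sigma) against log(S).
    mx, my = statistics.fmean(log_s), statistics.fmean(log_sigma)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log_s, log_sigma))
    sxx = sum((x - mx) ** 2 for x in log_s)
    return -sxy / sxx

# Synthetic demo: Gaussian growth rates whose spread decays as S^(-0.5).
random.seed(0)
demo_sizes = [10 ** random.uniform(4, 7) for _ in range(9000)]
demo_rates = [random.gauss(0.0, s ** -0.5) for s in demo_sizes]
beta_hat = scaling_exponent(demo_sizes, demo_rates, n_bins=9)
```

Repeating the call with `n_bins` ranging over several values gives a quick check of how sensitive the estimate is to the choice of binning.
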

Figure 4: (a) Dependence of the standard deviation of the growth rate on the initial size in log-log coordinates for both the primary
industry and the population. The solid lines are the power-law fits to the data with the scaling exponent $\beta = 0.21 \pm 0.06$ for the primary industry
and $\beta = 0.47 \pm 0.03$ for the population. (b) Dependence of the scaling exponent $\beta$ on the number of bins.



We change the number $N$ of bins and repeat the same analysis. The value of $N$ varies from 8 to 35 with a step of 1.
Similar power-law relations are found between the standard deviation $\sigma$ and the size $S$ for all values of $N$. Fig. 4(b)
illustrates the dependence of the estimated exponent $\beta$ on $N$. For the primary industry, $\beta$ fluctuates
between 0.1 and 0.25, which is consistent with previous results on the GDP and firm sales. For the population, the
exponent $\beta$ is approximately equal to 0.5, with slight fluctuations within (0.47, 0.51). The case of the population growth
is particularly interesting, since it provides a nice example of the dynamic growth model of a complex organization
containing subunits without interactions [12, 21]. Indeed, we can divide each county into many subunits of identical size.
These subunits need not be administrative districts or villages. It is reasonable to assume that there is no interaction
among different subunits in the population growth. It follows immediately that $\beta = 0.5$ [12, 21]. This argument
can be modeled using the city clustering algorithm, which leads to the conclusion that the exponent $\beta$ is related to the
long-range spatial correlations in population growth and that $\beta = 0.5$ if there are no spatial correlations [17].
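
The step from non-interacting subunits to $\beta = 0.5$ can be made explicit with a short calculation. The sketch below assumes that a county of size $S$ consists of $n = S/s_0$ subunits of identical size $s_0$, that their growth rates $r_i$ are independent with a common finite variance $\sigma_0^2$, and that the rates are small enough for the county's growth rate to be approximated by the average of the subunit rates. The symbols $n$, $s_0$, $r_i$ and $\sigma_0$ are introduced here for illustration.

```latex
r \approx \frac{1}{n} \sum_{i=1}^{n} r_i
\quad\Longrightarrow\quad
\sigma^2(S) = \frac{\sigma_0^2}{n} = \frac{\sigma_0^2 s_0}{S}
\quad\Longrightarrow\quad
\sigma(S) = \sigma_0 \sqrt{s_0}\, S^{-1/2},
\qquad \beta = \frac{1}{2}
```

Positive correlations among the subunits make the variance of the average decay more slowly than $1/n$, which is one way to obtain $\beta < 0.5$.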



5. Conclusion 

In this work, we have investigated the growth dynamics of the primary industry and the population of 2079 counties
in mainland China using the data from the China County Statistical Yearbooks from 2000 to 2006. We found that the
annual growth rates are distributed according to Student's t distribution with the degree of freedom less than 2 for both
the primary industry and the population. When each sample is divided into two subgroups according to initial size,
the data for each subgroup also follow a t distribution. The power-law behavior in the tails of the distributions is further
confirmed by a nonparametric approach based on maximum likelihood estimation and the Kolmogorov-Smirnov
test [42]. The fact that the tail exponent is less than 2 means that the annual growth rates fall in the Lévy regime, which is
different from that of firms. Therefore, the growth dynamics of the primary industry and the population at the county
level in mainland China might be different from other complex organizations studied in the literature, especially by
the Boston School.

We also investigated the relationship between the sample standard deviation of the growth rates and the initial
size of the primary industry and the population, despite the fact that the variances of the growth rates theoretically
diverge. We observed a power-law dependence between the two quantities. For the primary industry, the scaling
exponent is less than 0.5, which is consistent with the fact that the economic systems of different counties interact
strongly. In contrast, for the population, no interactions between the growth dynamics of any two counties
can be pinpointed and the scaling exponent is close to 0.5, as expected from theoretical analysis [12, 21].



In summary, we have uncovered idiosyncratic properties in the growth dynamics of the primary industry and the
population of a developing country at the county level. Further research should be done to understand the underlying
microscopic mechanism causing such idiosyncrasy. Many important questions remain open.
For instance, we do not know whether Chinese firms have growth dynamics similar to those in developed countries.
Such comparative studies will help us to better understand socio-economic systems.